Abstract. We can think of a lensed quasar as taking the Hubble time, shrinking it by $\sim 10^{-11}$, and then presenting the
result to us as a time delay; the shrinking factor is of the order of the fractional sky-area that the lens occupies. This cute
fact is a straightforward consequence of lensing theory, and enables a simple rescaling of time delays. Observed time delays
have a 40-fold range, but after rescaling the range reduces to 5-fold. The latter range depends on details of the lens and
lensing configuration (for example, quads have systematically shorter rescaled time delays than doubles) and is as expected
from a simple model. The hypothesis that observed time-delay lenses all come from a generalized-isothermal family can be
ruled out. But there is no indication of drastically different populations either.

1. Introduction

Most of the observables in gravitational lensing (image positions and magnifications) are intrinsically
dimensionless. The exception is the time delay between images, which takes its dimensionality straight
from the universe:$^1$ $\Delta t \propto H_0^{-1}$. This remarkable fact is the essential reason for much
research effort going into measuring time delays. The observations have been increasingly successful: in
1995 there was but one controversial time delay, whereas currently there are nine non-controversial ones.
These are summarized in Table 1 below.

But curiously, even as the image and time delay data have improved, the error bars on the inferred $H_0$
have not. As an example, consider 0957+561. Between Kundić et al. (1997) and Oscoz et al. (2001) the
time-delay value changed by only 2%. But meanwhile, whereas Kundić et al. (1997) quote
$H_0 = 64 \pm 13$ (95% confidence) in the usual units of km s$^{-1}$ Mpc$^{-1}$, Bernstein & Fischer (1999)
with more imaging and more modelling conclude that the data imply only $77^{+??}_{-??}$, while
Keeton et al. (2000) assert that further data on the lensed host galaxy invalidate all previously
published models, and they decline to give an $H_0$ estimate at all. Basically, the problem is that simple
lens models are unable to fit the images to the mas-level demanded by current data, while more
complicated models can fit the data but are non-unique and can produce identical observables from very
different values of $H_0$.

Modellers have responded to this dilemma with two strategies. One is to try to identify simple models
that both have enough parameters to fit or nearly fit the data and can be justified on
galactic-structure grounds; Kochanek (2003) is typical of these. The other strategy is to try to explore
the space of all plausible models allowed by the data; Raychaudhury et al. (2003) is a recent example.
For a review by authors representing different points of view see Courbin et al. (2003).

$^1$ This point appears to have been first emphasized by Nityananda (1990), although it is implicit
already in Refsdal (1964).

In the current context of good data and active modelling but no consensus on models, it is interesting
to step back and pose some questions that tend to get obscured in the details of modelling. First, we
can think of the purpose of modelling time-delay lenses as being to discover one dimensionless number,
the factor relating $\Delta t$ and $H_0^{-1}$. What contributions to this number are well-constrained and
what are poorly constrained? What range of values do the data imply for the poorly-constrained part? Is
that range systematically different for doubles and quads, and/or for isolated lensing galaxies versus
interacting galaxies? And is that range consistent with what we expect from popular models? Nine systems
is a small sample, but it is enough to provide preliminary answers to these questions, and to do so is
the aim of this paper.

2. A scaling relation for time delays 

In lensing theory the arrival time can be written as

$$t(\boldsymbol\theta) = (1+z_L)\,\frac{D_L D_S}{c\,D_{LS}}
  \left[ \tfrac{1}{2}|\boldsymbol\theta - \boldsymbol\beta|^2 - \psi(\boldsymbol\theta) \right] \tag{1}$$

where the symbols have their usual meanings. For convenience, let us abbreviate this expression. First,
we write $\tau(\boldsymbol\theta)$ for the expression inside square brackets. The factor outside square
brackets equals $H_0^{-1}$ times a dimensionless distance factor $D$ (say) that depends on redshifts and
(weakly) on cosmology, but not on $H_0$; for small $z_L$ and large $z_S$, $D \approx z_L(1+z_L)$. We note
further that only differences in arrival time between images are observable. Hence observable time
delays have the form

$$\Delta t = H_0^{-1}\, D\, \Delta\tau \tag{2}$$
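
As a rough illustration of the size of $D$, the following sketch evaluates it numerically. It assumes a flat universe with matter density `omega_m = 0.3`, which is an assumption made here and not a value taken from the text. In that case the factor outside the square brackets in Equation (1) reduces to $H_0^{-1}\,\chi_L\chi_S/(\chi_S-\chi_L)$, where $\chi$ is the comoving distance in units of $c/H_0$. The function names are hypothetical.

```python
import math

def comoving_distance(z, omega_m=0.3, steps=2000):
    """Comoving distance in units of c/H0 for an assumed flat universe.

    Uses a plain trapezoidal rule on the integral of dz / E(z).
    """
    def inv_e(x):
        return 1.0 / math.sqrt(omega_m * (1.0 + x) ** 3 + 1.0 - omega_m)

    h = z / steps
    total = 0.5 * (inv_e(0.0) + inv_e(z))
    for i in range(1, steps):
        total += inv_e(i * h)
    return total * h

def distance_factor(z_lens, z_source, omega_m=0.3):
    """Dimensionless factor D such that delta_t = D * delta_tau / H0 (flat universe assumed)."""
    if z_source <= z_lens:
        raise ValueError("the source must lie behind the lens")
    chi_l = comoving_distance(z_lens, omega_m)
    chi_s = comoving_distance(z_source, omega_m)
    return chi_l * chi_s / (chi_s - chi_l)

# Hypothetical redshifts, chosen only for illustration.
print(distance_factor(0.3, 2.0))
```

For a nearby lens and a distant source the result approaches $z_L$, in line with the leading-order behaviour quoted above.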



Table 1. Summary of time-delay data. 



We expect that $\Delta\tau$ will be of the same order as $|\boldsymbol\theta - \boldsymbol\beta|^2$
but a few times smaller, the precise value depending on details of lens and lens configuration. For an
observed lens we might predict

$$\Delta\tau \propto (\theta_1 + \theta_2)^2 \tag{3}$$

where $\theta_1$, $\theta_2$ are the $\theta$ values of the first and last images to arrive. To focus
attention on the proportionality factor, I propose to consider the dimensionless quantity

$$\varphi = \frac{\Delta\tau}{\tfrac{1}{16}(\theta_1 + \theta_2)^2}. \tag{4}$$

We can calculate $\varphi$ from a lens model, but not directly from observations. We can, however,
measure a related quantity, a scaled time delay

$$\Delta T = \frac{\Delta t}{\tfrac{1}{16}(\theta_1 + \theta_2)^2\, D} \tag{5}$$

directly from observations, and substituting equations (2) and (4) we see that

$$\Delta T = \varphi\, H_0^{-1}. \tag{6}$$

The factor $\tfrac{1}{16}$ is ad hoc, but it allows the following interpretation. Recall that the image
separation in a galaxy lens is about twice the Einstein radius:

$$\theta_1 + \theta_2 \simeq 2\theta_E. \tag{7}$$

For an isothermal, the relation is exact. But even for non-circular lenses, where $\theta_E$ is not
strictly defined, the image configuration can be used to define an effective $\theta_E$. Using (7) the
denominator in (4) is $\pi\theta_E^2/(4\pi)$, i.e., the area of the Einstein ring as a fraction of the
sky. In other words, if we scale the observed time delay by the lens's covering factor on the sky we
get $H_0^{-1}$ times a 'fudge factor' of the order of unity.
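
To make the rescaling concrete, here is a minimal sketch of Equations (4) to (6) in code. The function names, the choice of arcseconds, days and Gyr as units, and the example lens values are all illustrative assumptions and are not taken from Table 1.

```python
import math

ARCSEC_IN_RAD = math.pi / (180.0 * 3600.0)
DAYS_PER_GYR = 365.25e9

def covering_factor(theta1_arcsec, theta2_arcsec):
    """(theta1 + theta2)^2 / 16 in steradian-like units: roughly the Einstein-ring area over 4 pi."""
    total = (theta1_arcsec + theta2_arcsec) * ARCSEC_IN_RAD
    return total ** 2 / 16.0

def scaled_delay_gyr(delay_days, theta1_arcsec, theta2_arcsec, distance_factor):
    """Scaled time delay of Equation (5), returned in Gyr."""
    delay_gyr = delay_days / DAYS_PER_GYR
    return delay_gyr / (covering_factor(theta1_arcsec, theta2_arcsec) * distance_factor)

def fudge_factor(delay_days, theta1_arcsec, theta2_arcsec, distance_factor,
                 hubble_time_gyr=15.0):
    """varphi of Equation (6), for an assumed Hubble time."""
    scaled = scaled_delay_gyr(delay_days, theta1_arcsec, theta2_arcsec, distance_factor)
    return scaled / hubble_time_gyr

# Hypothetical lens: images at 1.2 and 0.8 arcsec, D = 0.5, delay of 30 days.
print(covering_factor(1.2, 0.8))
print(fudge_factor(30.0, 1.2, 0.8, 0.5))
```

For an image separation of about two arcseconds the covering factor is a few times $10^{-12}$, which is the origin of the shrinking factor quoted in the abstract once $D$ and $\varphi$ are included.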

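For isothermal lenses, $\varphi$ ranges from 0 to 8, averaging $\tfrac{16}{3}$. To see this, recall that
for isothermals, $\Delta\tau = 2\theta_E\beta$ and note that $\beta$ could be anywhere in the Einstein
ring. Hence $\langle\beta\rangle = \tfrac{2}{3}\theta_E$ and using (7) gives

$$\langle\Delta\tau\rangle_{\rm iso} = \tfrac{1}{3}(\theta_1 + \theta_2)^2. \tag{8}$$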
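
The intermediate steps can be written out as follows. This is a worked sketch that assumes sources are spread uniformly over the disc enclosed by the Einstein ring, which is one reading of "anywhere in the Einstein ring".

```latex
\begin{align*}
\langle\beta\rangle
  &= \frac{\int_0^{\theta_E} \beta \, 2\pi\beta \, d\beta}{\pi\theta_E^2}
   = \tfrac{2}{3}\,\theta_E, \\
\langle\Delta\tau\rangle_{\rm iso}
  &= 2\theta_E \langle\beta\rangle
   = \tfrac{4}{3}\,\theta_E^2
   = \tfrac{1}{3}\,(\theta_1+\theta_2)^2, \\
\varphi
  &= \frac{2\theta_E\beta}{\tfrac{1}{16}(2\theta_E)^2}
   = \frac{8\beta}{\theta_E},
  \qquad 0 \le \varphi \le 8,
  \qquad \langle\varphi\rangle = \tfrac{16}{3}.
\end{align*}
```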



Equation (8) is interesting for comparison with non-isothermals, but for isothermals themselves, we can
do better. Combining $\Delta\tau = 2\theta_E\beta$ with $\theta_1 - \theta_2 = 2\beta$, which isothermals
also satisfy, allows us to define

$$\Delta T_{\rm iso} \equiv \frac{\Delta t}{\tfrac{1}{2}(\theta_1^2 - \theta_2^2)\, D} \tag{9}$$

which equals $H_0^{-1}$.

Witt et al. (2000, hereafter WMK) show that $\Delta T_{\rm iso} = H_0^{-1}$ is not restricted to
isothermals but is valid for a large family of generalized-isothermal lenses, and argue that it will be
generally applicable in nature. If so, $\varphi$ could be eliminated altogether. We can readily test if
this is the case.
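
As a consistency check on Equation (9), the sketch below builds a synthetic circular isothermal lens and confirms that the isothermal scaling returns the Hubble time that was put in. The function names and all numerical values are hypothetical. Angles are in radians and times in Gyr.

```python
import math

def isothermal_images(theta_e, beta):
    """Image distances (first-arriving, last-arriving) for a circular isothermal lens."""
    if not 0.0 < beta < theta_e:
        raise ValueError("a double requires 0 < beta < theta_e")
    return theta_e + beta, theta_e - beta

def delta_t_iso(delay, theta1, theta2, distance_factor):
    """Isothermal scaled delay of Equation (9); same time unit as `delay`."""
    return delay / (0.5 * (theta1 ** 2 - theta2 ** 2) * distance_factor)

# Synthetic example with assumed inputs.
hubble_time = 15.0           # Gyr, assumed
distance_factor = 0.5        # dimensionless D, assumed
theta_e, beta = 5.0e-6, 2.0e-6

theta1, theta2 = isothermal_images(theta_e, beta)
delta_tau = 2.0 * theta_e * beta                    # isothermal result quoted in the text
delay = hubble_time * distance_factor * delta_tau   # Equation (2)

print(delta_t_iso(delay, theta1, theta2, distance_factor))
```

The recovery works because $\tfrac{1}{2}(\theta_1^2 - \theta_2^2) = \tfrac{1}{2}(\theta_1+\theta_2)(\theta_1-\theta_2) = 2\theta_E\beta$ for this lens.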
3. Scaling the data

We now present the obvious comparison of the scaled time delays $\Delta T = \varphi H_0^{-1}$ with
current data.

Table 1 lists the relevant quantities for the various time-delay systems. The time-delay references are
given in the table, and the other data are taken from the CASTLES survey and compilation by
Kochanek et al. (1998). For quads, only the first and last images (that is, the longest time delay) are
considered, to enable a simple comparison with doubles. There are some caveats to the values of
$\theta_1$ and $\theta_2$: for 1830 and 0218 the lens centre is very uncertain and hence $\theta_1$,
$\theta_2$ are especially uncertain; for 1608 the lens is apparently an interacting pair of galaxies;
and 0957 and 0911 are in clusters and hence have large lensing contributions from other galaxies.

Figure 1 shows $\Delta T$ against $\Delta t$ for the currently known time-delay systems. Since error
bars on time delays are typically a few percent they are not shown here. We notice three things:

- Whereas $\Delta t$ ranges over a factor of 40, $\Delta T$ ranges over a factor of 5.

- No correlation is evident between $\Delta T$ and $\Delta t$. According to the shuffling test
  described in Appendix A, the trend is significant at the 75% level, i.e., not significant.
  (Meanwhile, Figure 2 shows how Figure 1 changes if we ignore all redshift information and simply
  set $D = 1$. The scatter increases, but again there is no significant trend.)

- If we assume that $H_0^{-1}$ is $\sim 15$ Gyr, then the range of $\varphi$ is 1.5-2 for quads and
  about 2-6 for doubles. [R. Ibata (personal communication) drew attention to this separation from
  an early version of Figure 1.]

The various caveats above do not appear to affect these points. 

Fig. 1. Plot of the scaled time delay $\Delta T$ defined in Equation (5) against the observed time
delay. The various lenses are labelled by their short names: quads are labelled below, doubles above.

We can also compare $\Delta T_{\rm iso} = H_0^{-1}$ against the data to test whether the lenses belong
to the generalized isothermal family studied by WMK. Figure 3 shows $\Delta T_{\rm iso}$ against
$\Delta t$ for the same systems. We notice the following:

- $\Delta T_{\rm iso}$ (expected to be constant, since there is no $\varphi$ factor) ranges over a
  factor of 5.

- Larger lenses tend to give lower $\Delta T_{\rm iso}$, and according to the shuffling test, this
  non-physical trend is significant at the 95% level.

- Of the nine lenses, only 1115, 1520 and 1830 even give 10 Gyr $< \Delta T_{\rm iso} <$ 20 Gyr, let
  alone a consistent $\Delta T_{\rm iso} = H_0^{-1}$.

Again we must keep in mind the caveats above, and also that the large external shear in 1115, 0957 and
0911 means that for these lenses $\Delta T_{\rm iso}$ properly speaking requires a modification given
in WMK but disregarded here. On the other hand, it seems unlikely that these caveats will solve the
serious discrepancies we see. It appears more likely that most real lenses do not belong to the
generalized isothermal family.

Whereas $\Delta T_{\rm iso}$ is rejected, are other scalings possible that improve upon $\Delta T$?
L.L.R. Williams (personal communication) points out that the definition (Equation 5) of $\Delta T$
considers the size of the lens but not its asymmetry, and that if we multiply $(\theta_1 + \theta_2)^2$
in the definition by a further factor of $\sqrt{(\theta_1 - \theta_2)/(\theta_1 + \theta_2)}$ as a
measure of asymmetry, then the scaled time delays would range over a factor of only 2.5, with no
significant trend. But the meaning of such an asymmetry correction in terms of lensing theory is not
known.

Fig. 2. As in Figure 1 but omitting the $D$ factor in the scaled time delay.

Fig. 3. $\Delta T_{\rm iso}$ as defined in Equation (9) against the observed time delay (in days). The
non-physical trend is significant (see text), and hence the generalized isothermal models are rejected.

4. Modelling the range of $\varphi$

From the above, it appears that the scatter in $\varphi$ reflects a range of mass profiles and source
positions, and that its value must be inferred for each lens by detailed modelling. But without going
into detailed models for nine lenses, we can at least check whether the observed range of $\varphi$ is
plausible.

Figure 4 shows such a check. The main plot is of $\varphi$ against the area $(\theta_1 + \theta_2)^2$
for an example model (an elliptical isothermal potential plus external shear). The value of $\varphi$
is shown for different source positions, the two loops corresponding to source positions along the two
caustics (actually just inside the caustics, to avoid computational problems). Quads are below the
lower loop, with $\varphi < 2$. Doubles are between the two loops, with $2 < \varphi < 6$.$^2$ The
values are model-dependent: for example, a steeper model will have both loops somewhat higher. Also,
the value of $(\theta_1 + \theta_2)^2$ depends on the source position: smaller for sources along the
long axis of the potential, larger for sources perpendicular to that axis. But with these
qualifications, Figure 4 shows that the general range of $\varphi$, including the separation of quads
and doubles, is just as it is in the data, and there is no evidence that the observed systems come from
drastically different populations of lenses.

$^2$ Note that Figure 4 does not show a probability distribution, unlike related plots in
Oguri et al. (2002). The aim in Figure 4 is simply to show the separation of $\varphi$ for quads and
doubles.

Fig. 4. Computation of $\varphi$ values from a simple model of 1115+080, taken from
Saha & Williams (2003). The top two panels show an image morphology similar to 1115+080, and the
corresponding source position. The lower panel shows $\varphi$ against $(\theta_1 + \theta_2)^2$ for
source positions along the two caustics. [The horizontal axis is not labelled because
$(\theta_1 + \theta_2)^2$ has arbitrary units: arcsec$^2$, steradians, etc.] The lower loop corresponds
to the diamond caustic and the upper loop corresponds to the outer caustic. Hence quads are below the
lower loop and doubles are between the two loops.

5. Summary 

We see in this paper a new interpretation of lensing time delays: $\Delta t$ is $H_0^{-1}$ shrunk by
the lens's covering factor on the sky, times a number of the order of unity. On separating off a
redshift-dependent term (also of order unity) we are left with a number $\varphi$ (say) that
summarizes the dependence on details of the lens and lens configuration.

Using these ideas, we can rescale the observed time delays for the nine currently-measured systems.
The observed time delays range over a factor of 40, but the rescaled delays range over a factor of 5.
The latter is the inferred range of $\varphi$, and moreover it appears that $\varphi < 2$ for quads and
$2 < \varphi < 6$ for doubles. Reassuringly, the same spread in $\varphi$ is reproduced by a simple
model.

Using rescaled time-delays we can also test the hypothesis 
that the observed lenses all belong to a generalized-isothermal 
family. This hypothesis is ruled out: it over-predicts time delays 
for large lenses. On the other hand, there is no indication that 
the known time-delay systems come from drastically different 
types of lenses. 



Appendix A: Significance of trends 

In Figures 1 to 3 we have some points $(x_i, y_i)$ and we want to know whether there is any trend in
the scatter. There are many statistical tests relating to the significance of trends in data, but none
of the standard ones address quite this question. However, it is not difficult to design a suitable
statistical test. Let us pose the question: what is the probability of improving the fit to
$y = \mathrm{constant}$ by shuffling the $y_i$? If nearly all shufflings reduce the |slope| we would
conclude that the data have a trend.

In the familiar straight-line fit, the slope is monotonic in $\sum_i x_i y_i$. Hence as a statistic,
$\sum_i x_i y_i$ is equivalent to the slope.

In the main text, I use the phrase "significant at the 95% level" to mean that 5% of shufflings
increase the |slope|. Statisticians might use a phrase like "p-value of 95%".
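
The shuffling test can be sketched in a few lines. The following is a minimal illustration, not the code used for the figures: the function names, the number of shuffles, the fixed seed, and the use of random shuffles rather than an enumeration of all permutations are assumptions. Since the means of $x$ and $y$ are unchanged by shuffling, the least-squares slope is proportional to $\sum_i x_i y_i - n\bar{x}\bar{y}$, and the sketch compares the absolute value of that quantity.

```python
import random

def centred_cross_sum(xs, ys):
    """Sum of x_i * y_i minus n * mean(x) * mean(y); proportional to the fitted slope."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum(x * y for x, y in zip(xs, ys)) - n * mean_x * mean_y

def trend_significance(xs, ys, n_shuffles=10000, seed=0):
    """Fraction of random shufflings of y that do NOT increase |slope|.

    A return value of 0.95 corresponds to "significant at the 95% level"
    in the sense used in the main text.
    """
    rng = random.Random(seed)
    observed = abs(centred_cross_sum(xs, ys))
    shuffled = list(ys)
    increased = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if abs(centred_cross_sum(xs, shuffled)) > observed:
            increased += 1
    return 1.0 - increased / n_shuffles

# Made-up points, for illustration only.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [5, 3, 6, 4, 7, 5, 8, 6, 9]
print(trend_significance(xs, ys))
```

Whether to shuffle the raw values or their logarithms is a separate choice that the sketch leaves to the caller; the text does not specify it.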

Acknowledgements. I am grateful to Rodrigo Ibata and Liliya